Leveling Up Jupyter Notebook Skills

Some tips for using Jupyter, mostly learned from Gerrit Gruben's video. I will keep updating this as I pick up new skills.

Ref:

1. Simple start to Jupyter Notebook


In [2]:
!ls


BAMM.101x           Learn Jupyter.ipynb cookie.py           thinkbayes.py
Hash_Encryption     README.md           machine_learning    学术画像
LICENSE             __pycache__         structure&algorithm 线程进程

In [6]:
# %load cookie.py
"""This file contains code for use with "Think Bayes",
by Allen B. Downey, available from greenteapress.com

Copyright 2012 Allen B. Downey
License: GNU GPLv3 http://www.gnu.org/licenses/gpl.html
"""

from thinkbayes import Pmf

pmf = Pmf()
pmf.Set('Bowl 1', 0.5)
pmf.Set('Bowl 2', 0.5)

pmf.Mult('Bowl 1', 0.75)
pmf.Mult('Bowl 2', 0.5)

pmf.Normalize()

print((pmf.Prob('Bowl 1')))


0.6000000000000001
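The cell above relies on the `Pmf` class from `thinkbayes.py`, which is not shown here. As a rough sketch of the same calculation in plain Python, assuming `Set` assigns a prior, `Mult` multiplies by a likelihood, and `Normalize` rescales the values to sum to 1, a dictionary is enough:

```python
# Prior belief: each bowl is equally likely to have been chosen.
prior = {"Bowl 1": 0.5, "Bowl 2": 0.5}

# Likelihood of the observation under each hypothesis
# (the same factors passed to pmf.Mult in the cell above).
likelihood = {"Bowl 1": 0.75, "Bowl 2": 0.5}

# Multiply prior by likelihood, then normalize so the values sum to 1.
unnormalized = {h: prior[h] * likelihood[h] for h in prior}
total = sum(unnormalized.values())
posterior = {h: p / total for h, p in unnormalized.items()}

print(posterior["Bowl 1"])  # approximately 0.6
```

The posterior for Bowl 1 is 0.375 / 0.625, which agrees with the value printed by the `Pmf` version up to floating-point rounding.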

In [10]:
!pip install version_information


Collecting version_information
  Downloading version_information-1.0.3.tar.gz
Building wheels for collected packages: version-information
  Running setup.py bdist_wheel for version-information ... done
  Stored in directory: /Users/Beck/Library/Caches/pip/wheels/4b/4c/f7/4d99d7820a507d8ae55204fcc00d66cdabf596d4b01228e7bd
Successfully built version-information
Installing collected packages: version-information
Successfully installed version-information-1.0.3

In [12]:
%load_ext version_information
%version_information pandas, sklearn


The version_information extension is already loaded. To reload it, use:
  %reload_ext version_information
Out[12]:
Software  Version
Python    3.5.2 64bit [GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]
IPython   6.2.0
OS        Darwin 15.6.0 x86_64 i386 64bit
pandas    0.20.3
sklearn   0.19.0
Wed Sep 20 13:50:44 2017 CST
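If installing an extension is not an option, a similar report can be put together with the standard library. This is only a sketch: the helper name `report_versions` is made up for illustration, and it assumes each package exposes a `__version__` attribute, which is a common convention but not guaranteed.

```python
import importlib
import platform


def report_versions(module_names):
    """Return a dict mapping 'Python' and each module name to a version string."""
    versions = {"Python": platform.python_version()}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not available in this environment.
            versions[name] = "not installed"
            continue
        # Fall back to "unknown" when the package has no __version__ attribute.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


versions = report_versions(["pandas", "sklearn"])
for name, version in versions.items():
    print(f"{name:<10} {version}")
```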

In [13]:
%whos


Variable   Type    Data/Info
----------------------------
Pmf        type    <class 'thinkbayes.Pmf'>
pmf        Pmf     <thinkbayes.Pmf object at 0x1073309b0>
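Outside IPython there is no `%whos`, but a comparable listing can be approximated by walking a namespace dictionary. The helper below is a hypothetical sketch, not how IPython implements the magic; it simply reports each public name and the type of its value.

```python
def summarize_namespace(namespace):
    """Return (name, type name) pairs for the public names in a namespace dict."""
    rows = []
    for name, value in sorted(namespace.items()):
        if name.startswith("_"):
            continue  # skip private and dunder names
        rows.append((name, type(value).__name__))
    return rows


# A small stand-in namespace; in a script one could pass globals() instead.
demo_namespace = {"count": 3, "label": "bowl", "_hidden": None}
whos_rows = summarize_namespace(demo_namespace)
for name, type_name in whos_rows:
    print(f"{name:<10} {type_name}")
```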

In [14]:
# Automatically reload imported modules
%load_ext autoreload
%autoreload 2
%aimport cookie
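As a rough illustration of what the autoreload extension automates, the sketch below reloads a module by hand with `importlib.reload` after its source file changes. The module name `demo_module` and the temporary directory are assumptions made for this example only.

```python
import importlib
import os
import sys
import tempfile

# Skip writing .pyc files so the demo never picks up stale bytecode.
sys.dont_write_bytecode = True

# Create a throwaway module on disk (hypothetical name: demo_module).
tmp_dir = tempfile.mkdtemp()
module_path = os.path.join(tmp_dir, "demo_module.py")
with open(module_path, "w") as f:
    f.write("VALUE = 1\n")

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
import demo_module

first_value = demo_module.VALUE

# Edit the source file, as one would in an editor while the kernel keeps running.
with open(module_path, "w") as f:
    f.write("VALUE = 200\n")

# Without a reload the old value would persist; reload re-executes the source.
importlib.reload(demo_module)
second_value = demo_module.VALUE

print(first_value, second_value)
```

With autoreload enabled, the explicit `importlib.reload` call becomes unnecessary because IPython checks for changed modules before running each cell.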

In [18]:
# Display a link to the referenced file
from IPython.display import FileLink
FileLink("cookie.py")


Out[18]:

In [28]:
import numpy as np
import sklearn 
import pandas
%matplotlib inline
import matplotlib.pyplot as plt

In [22]:
from sklearn.datasets import load_boston

TAB: autocompletion

Shift+TAB: parameter hints
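The parameter hints shown by Shift+TAB can also be inspected from code. A minimal sketch using the standard library's `inspect.signature`; the function `fit_forest` is a hypothetical stand-in, not part of scikit-learn:

```python
import inspect


def fit_forest(n_estimators=10, max_depth=None, random_state=None):
    """Hypothetical stand-in for an estimator constructor."""
    return n_estimators, max_depth, random_state


signature = inspect.signature(fit_forest)
print(signature)

# Walk the parameters to see each name and its default value.
for name, parameter in signature.parameters.items():
    print(name, "default:", parameter.default)
```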


In [1]:
# Press Shift+Tab inside the parentheses
from sklearn.datasets import load_boston  # re-import after the kernel restart; removed in scikit-learn 1.2
load_boston()

Appending ? to a name displays its documentation


In [25]:
from sklearn.ensemble import RandomForestRegressor
RandomForestRegressor?
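The `?` suffix only works inside IPython and Jupyter. In a plain Python script, roughly the same text can be retrieved with `inspect.getdoc` or printed with the built-in `help`. A small sketch, where `normalize` is a made-up function used only to have a docstring to show:

```python
import inspect


def normalize(weights):
    """Scale the values in `weights` so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


# getdoc returns the cleaned-up docstring as a string.
doc_text = inspect.getdoc(normalize)
print(doc_text)
```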

In [33]:
%%latex
$$ x^3 + C = \int{3 x^2 \; dx} \quad (C \in \mathbb{R})$$


$$ x^3 + C = \int{3 x^2 \; dx} \quad (C \in \mathbb{R})$$

In [2]:
import sklearn 
import numpy as np
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt

In [3]:
from sklearn.datasets import fetch_california_housing
cal = fetch_california_housing()
df = pd.DataFrame(data=cal.data, columns=cal.feature_names, index=cal.target)
df.head(10)


Downloading Cal. housing from https://ndownloader.figshare.com/files/5976036 to /Users/Beck/scikit_learn_data
Out[3]:
       MedInc  HouseAge  AveRooms  AveBedrms  Population  AveOccup  Latitude  Longitude
4.526  8.3252      41.0  6.984127   1.023810       322.0  2.555556     37.88    -122.23
3.585  8.3014      21.0  6.238137   0.971880      2401.0  2.109842     37.86    -122.22
3.521  7.2574      52.0  8.288136   1.073446       496.0  2.802260     37.85    -122.24
3.413  5.6431      52.0  5.817352   1.073059       558.0  2.547945     37.85    -122.25
3.422  3.8462      52.0  6.281853   1.081081       565.0  2.181467     37.85    -122.25
2.697  4.0368      52.0  4.761658   1.103627       413.0  2.139896     37.85    -122.25
2.992  3.6591      52.0  4.931907   0.951362      1094.0  2.128405     37.84    -122.25
2.414  3.1200      52.0  4.797527   1.061824      1157.0  1.788253     37.84    -122.25
2.267  2.0804      42.0  4.294118   1.117647      1206.0  2.026891     37.84    -122.26
2.611  3.6912      52.0  4.970588   0.990196      1551.0  2.172269     37.84    -122.25

In [4]:
plt.scatter(df.Longitude, df.Latitude)  # longitude on x, latitude on y, as on a map


Out[4]:
<matplotlib.collections.PathCollection at 0x11005fc88>

In [5]:
import seaborn as sns

In [7]:
sns.jointplot(x=df.Longitude, y=df.Latitude)


Out[7]:
<seaborn.axisgrid.JointGrid at 0x1138c2b38>

In [9]:
sns.set(style="ticks")

sns.jointplot(x=df.Longitude, y=df.Latitude, kind="hex", color="#4CB391")


Out[9]:
<seaborn.axisgrid.JointGrid at 0x11b12ea20>

In [13]:
from scipy.stats import kendalltau
kendalltau?
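To make the statistic less of a black box, here is a minimal pure-Python sketch of Kendall's tau. It implements the simple tau-a form and assumes the data has no ties; `scipy.stats.kendalltau` computes the tau-b variant by default, which adds a correction for ties, so the two agree only on tie-free data. The function name `kendall_tau_a` is chosen for this example.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    # Compare every pair of observations once.
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


tau = kendall_tau_a([1, 2, 3, 4, 5], [3, 1, 2, 5, 4])
print(tau)  # 7 concordant and 3 discordant pairs out of 10 -> 0.4
```

This loop is quadratic in the number of points, so it is meant for understanding the definition rather than for a dataset the size of the housing data.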

In [14]:
sns.set(style="ticks")

# Note: stat_func was deprecated in seaborn 0.9 and removed in later releases
sns.jointplot(x=df.Longitude, y=df.Latitude, kind="hex", stat_func=kendalltau, color="#4CB391")


Out[14]:
<seaborn.axisgrid.JointGrid at 0x11be1d9b0>
